N=1:

rank | frequency | n-gram |
---|---|---|
1 | 140255 | -e |
2 | 102502 | -r |
3 | 101885 | -n |
4 | 78407 | -t |
5 | 74431 | -s |

N=2:

rank | frequency | n-gram |
---|---|---|
1 | 87443 | -er |
2 | 79394 | -en |
3 | 43241 | -et |
4 | 33657 | -ne |
5 | 26334 | -de |

N=3:

rank | frequency | n-gram |
---|---|---|
1 | 22180 | -ing |
2 | 20866 | -rne |
3 | 14610 | -ter |
4 | 11470 | -gen |
5 | 10666 | -sen |

N=4:

rank | frequency | n-gram |
---|---|---|
1 | 20476 | -erne |
2 | 9900 | -ning |
3 | 8519 | -ngen |
4 | 6639 | -nger |
5 | 6496 | -ende |

N=5:

rank | frequency | n-gram |
---|---|---|
1 | 7691 | -ingen |
2 | 6046 | -inger |
3 | 3163 | -terne |
4 | 3121 | -elsen |
5 | 3057 | -ering |

The tables show the most frequent letter n-grams at the ends of words for N=1…5. The procedure parallels 2.2.5 Most frequent word beginnings, except that n-grams are taken from the end of each word; the aim is suffix detection rather than prefix detection.
For N=3, the list is produced by the following query:
SELECT @pos:=(@pos+1), xx.* FROM (SELECT @pos:=0) r, (SELECT COUNT(*) AS cnt, CONCAT("-", RIGHT(word,3)) FROM words WHERE w_id>100 GROUP BY RIGHT(word,3) ORDER BY cnt DESC) xx LIMIT 5;
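
For readers without access to the database, the same counting idea can be sketched in Python. This is a minimal illustration and not the query itself: the function name `top_endings`, the sample word list, and the alphabetical tie-breaking are assumptions made here, and the `w_id>100` filter is not reproduced. Like the query, it counts each entry of the word list once and takes the last N letters of the word.

```python
from collections import Counter


def top_endings(words, n, k=5):
    """Return the k most frequent word-final letter n-grams.

    Each result is a (rank, count, ending) tuple, with the ending
    prefixed by "-" as in the tables.
    """
    # word[-n:] yields the whole word when it is shorter than n,
    # which matches what RIGHT(word, n) returns in MySQL.
    counts = Counter(word[-n:] for word in words)
    # Sort by descending count; ties are broken alphabetically.
    # (An assumption of this sketch: the SQL leaves tie order unspecified.)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [
        (rank, count, "-" + ending)
        for rank, (ending, count) in enumerate(ranked[:k], start=1)
    ]


# Hypothetical sample of distinct word forms, for illustration only.
sample = ["husene", "bilerne", "bøgerne", "regningen", "løsningen", "maler"]

for row in top_endings(sample, 3, k=3):
    print(row)
```

Calling `top_endings(sample, n)` with n from 1 to 5 gives one ranked list per table; with a real word list the counts would be over word forms, not running text.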
2.2.5 Most frequent word beginnings